Medical Physics — Latest Matching Preprints

1

FLASH Radiotherapy is faster than a heartbeat: A compartmental model to illustrate the interplay between tissue oxygen perfusion and ultra-high dose rate effects.

Ballesteros-Zebadua, P.; Jansen, J.; Grilij, V.; Franco-Perez, J.; Vozenin, M.-C.; Abolfath, R.

2026-03-16 biochemistry 10.64898/2026.03.12.711443 medRxiv

Top 0.1%

39.1%

Ultra-high-dose-rate therapy enhances the protection of normal tissues and reduces side effects while effectively controlling tumors. This biological phenomenon is called the FLASH effect, and when observed, therapy is called FLASH Radiotherapy (FLASH-RT). Various hypotheses have been proposed to explain how ultra-high dose rates achieve these effects under different conditions, with the impact of tissue oxygen perfusion still needing further investigation. FLASH-RT involves brief exposure to radiation, so fewer heartbeats occur during the irradiation period, which could reduce tissue oxygen perfusion within the treatment timeframe. Therefore, we developed a compartmental model to simulate oxygen transfer and its interaction with radiation. The proposed model consists of three compartments: 1) the heart and arteries; 2) the irradiated brain's blood vessels and capillaries; and 3) the irradiated brain tissue. We employed a system of differential equations, incorporating experimental data from in vivo oxygen measurements using the Oxyphor probe in the brain, to fit the model parameters to the experimental results. This model shows how dose rate and oxygen perfusion could influence chemical processes such as lipid peroxidation, potentially leading to differential biological effects. Our analysis of lipid peroxidation as a function of dose rate revealed a sigmoidal dose-rate-response curve that correlates well with several published biological response datasets. Our results indicate that the differential chemical effects of FLASH-RT compared with conventional dose rates may depend on factors such as oxygen perfusion, consumption, and tissue oxygen tension. This suggests that the temporal dynamics of oxygen could play a crucial role in enhancing the therapeutic window for FLASH-RT treatments. Furthermore, it suggests that the magnitude of some observed FLASH effects may vary across tissues or tumors and across experimental models, given differential oxygen dynamics.
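The three-compartment idea in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: the equations, the fixed arterial supply (no pulsatile heartbeat), and every rate constant (`k_perf`, `k_diff`, `k_cons`, `g`) are hypothetical values chosen only to show how delivering the same dose faster than oxygen can be resupplied produces a deeper transient drop in tissue oxygen.

```python
def baseline_state(a, k_perf, k_diff, k_cons):
    """Steady-state (vessel, tissue) oxygen levels with the beam off."""
    tissue = k_perf * a / (k_perf * (k_diff + k_cons) / k_diff + k_cons)
    vessel = tissue * (k_diff + k_cons) / k_diff
    return vessel, tissue


def min_tissue_oxygen(dose_rate, total_dose=10.0, dt=1e-3,
                      a=100.0, k_perf=1.0, k_diff=0.5, k_cons=0.2, g=0.04):
    """Forward-Euler run; returns the lowest tissue oxygen level reached.

    a       arterial oxygen, held constant (assumption)
    k_perf  artery -> capillary exchange rate, 1/s (hypothetical)
    k_diff  capillary -> tissue exchange rate, 1/s (hypothetical)
    k_cons  metabolic consumption rate, 1/s (hypothetical)
    g       fractional oxygen depletion per Gy (hypothetical)
    """
    vessel, tissue = baseline_state(a, k_perf, k_diff, k_cons)
    beam_time = total_dose / dose_rate
    steps = int(round((beam_time + 5.0) / dt))  # follow 5 s of recovery
    lowest = tissue
    for i in range(steps):
        rate = dose_rate if i * dt < beam_time else 0.0
        d_vessel = k_perf * (a - vessel) - k_diff * (vessel - tissue)
        d_tissue = (k_diff * (vessel - tissue)
                    - k_cons * tissue
                    - g * rate * tissue)
        vessel += dt * d_vessel
        tissue += dt * d_tissue
        lowest = min(lowest, tissue)
    return lowest


_, baseline = baseline_state(100.0, 1.0, 0.5, 0.2)
conventional = min_tissue_oxygen(dose_rate=0.1)   # 10 Gy delivered in 100 s
flash = min_tissue_oxygen(dose_rate=100.0)        # 10 Gy delivered in 0.1 s
print(f"baseline {baseline:.1f}, conventional min {conventional:.1f}, "
      f"FLASH min {flash:.1f}")
```

With these made-up constants the conventional delivery barely perturbs tissue oxygen, while the FLASH delivery depletes it before perfusion can respond. A fitted model would replace the constants with values estimated from the oxygen measurements described in the abstract.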

2

Evaluating the Large Language Model-Based Quality Assurance Tool for Auto-Contouring

Tozuka, R.; Akita, T.; Matsuda, M.; Tanno, H.; Saito, M.; Nemoto, H.; Mitsuda, K.; Kadoya, N.; Jingu, K.; Onishi, H.

2026-04-01 radiology and imaging 10.64898/2026.03.31.26349802 medRxiv

Top 0.1%

19.0%

Purpose: Manual verification of AI-based auto-contouring is labor-intensive and prone to fatigue-related errors. This study developed the large language model (LLM)-based automated Quality Assurance (QA) for auto-contouring (LAQUA) system using a multimodal LLM, Gemini 2.5 Pro, and evaluated its feasibility as a clinical primary screening tool to streamline the QA workflow. Methods: Twenty male pelvic CT scans from an open dataset were utilized. Three distinct auto-contouring software packages (OncoStudio, RatoGuide prototype and syngo.via) were evaluated. Auto-contouring results for each slice were exported as PDF images with overlaid contours and input into Gemini 2.5 Pro. The LLM was instructed to rate the contour quality on a 5-point clinical scale (5: Optimal; 4: Acceptable; 3: Suboptimal; 2: Unacceptable; redraw from scratch; 1: Unacceptable; organ not detected). Using evaluations by two board-certified radiation oncologists as ground truth, Spearman's rank correlation coefficients (ρ) and weighted kappa coefficients (κ) were calculated. Additionally, to assess screening performance, sensitivity and specificity were calculated by dichotomizing the scores into "Pass" and "Fail" using two different cutoffs (scores ≥ 3 and ≥ 4 as "Pass"). Finally, the alignment of the rationales provided by the LLM with the auto-contouring quality was evaluated by two board-certified radiation oncologists. This was conducted using a Likert scale assessing four domains (error detection, hallucination, clinical relevance, and anatomical understanding), each scored out of 2 points. Results: The LAQUA system demonstrated moderate to strong agreement with expert judgments across all evaluated organs (ρ: 0.567 - 0.835; quadratic weighted κ: 0.639 - 0.804), with the rectum showing the highest correlation. Regarding screening performance, a cutoff of ≥ 3 as "Pass" achieved the highest sensitivity and specificity in specific subgroups, but with wide 95% confidence intervals (CIs). A cutoff of ≥ 4 as "Pass" narrowed the CIs, yielding the highest sensitivity in the rectum (0.976) and the highest specificity in the left femoral head (0.933). Qualitatively, the LLM's rationales achieved an overall mean score of 1.70 ± 0.48 (out of 2), with 155 of 291 outputs receiving perfect scores across all criteria. Conclusions: The LAQUA system demonstrated substantial agreement with expert evaluations in AI-based auto-contouring quality assessment. While potential overestimation bias (risk of missing "Fail" cases) warrants caution, the observed sensitivity suggests its feasibility as a primary screening QA tool to efficiently filter acceptable contours, thereby reducing the clinical workload.
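The agreement statistics named in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the study's analysis code: the ratings below are invented, and treating "Pass" as the positive class when computing sensitivity and specificity is an assumption, since the abstract does not state which class is positive.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed scores."""
    return pearson(ranks(x), ranks(y))


def quadratic_weighted_kappa(x, y, categories=(1, 2, 3, 4, 5)):
    """Cohen's kappa with quadratic disagreement weights."""
    n, k = len(x), len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(x, y):
        observed[idx[a]][idx[b]] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2
            num += w * observed[i][j]
            den += w * row[i] * col[j] / n  # expected count under independence
    return 1.0 - num / den


def sensitivity_specificity(truth, pred, cutoff):
    """Dichotomize at score >= cutoff; 'Pass' is the positive class (assumption)."""
    pairs = list(zip(truth, pred))
    tp = sum(t >= cutoff and p >= cutoff for t, p in pairs)
    fn = sum(t >= cutoff and p < cutoff for t, p in pairs)
    tn = sum(t < cutoff and p < cutoff for t, p in pairs)
    fp = sum(t < cutoff and p >= cutoff for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Invented 5-point ratings, for illustration only.
expert = [5, 4, 3, 2, 1, 4, 5, 3]
llm = [5, 4, 4, 2, 2, 4, 5, 3]
rho = spearman(expert, llm)
kappa = quadratic_weighted_kappa(expert, llm)
sens, spec = sensitivity_specificity(expert, llm, cutoff=4)
print(rho, kappa, sens, spec)
```

Raising the cutoff from 3 to 4 moves borderline "Suboptimal" contours into the "Fail" group, which is why the two cutoffs in the abstract yield different sensitivity and specificity.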

3

Visual Fidelity-Driven Quality Assessment of Medical Image Translation

Bizjak, Z.; Zagar, J.; Spiclin, Z.

2026-03-20 radiology and imaging 10.64898/2026.03.18.26348721 medRxiv

Top 0.1%

18.9%

Automated and reliable image quality assessment (IQA) is essential for safe use of medical image synthesis in critical applications like adaptive radiotherapy, treatment planning, or missing-modality reconstruction, where unnoticed generative artifacts may adversely affect outcomes. We evaluated image-to-image translation quality by coupling large-scale expert visual quality assessment with explainable automated IQA modeling. An adversarial diffusion-based framework, SynDiff, was applied to four cross-modality synthesis tasks, including three inter-MR translations and a CBCT-to-CT translation. Using four-fold cross-validation, ten reference-based and eight no-reference IQA metrics were computed for all synthesized images. Visual IQA ratings were independently collected from thirteen expert raters using a predetermined protocol and a specialized image viewer enabling blinded, randomized six-point Likert scoring. Auto-Sklearn was employed to learn ensemble regression models mapping IQA metrics to visual consensus ratings, with separate models trained on reference-based and no-reference metrics. The models closely reproduced the distribution and ordering of expert ratings, typically within ±0.5 Likert points. Reference-based models achieved higher agreement with visual ratings than no-reference models (R² 0.75 vs. 0.59, respectively), although the latter remained unbiased and informative. Explainability analyses highlighted structure- and contrast-sensitive metrics as key predictors. Overall, the results demonstrate that ensemble regression models can provide transparent, scalable, and clinically meaningful quality control for generative medical imaging.

4

Automated Segmentation of Head and Neck Cancer from CT Images Using 3D Convolutional Neural Networks

Prabhanjans, P.; Punathil, A. N.; V K, A.; Thomas T, H. M.; Sasidharan, B. K.; Shaikh, H.; Varghese, A. J.; Kuchipudi, R. B.; Pavamani, S.; Rajan, J.

2026-03-13 radiology and imaging 10.64898/2026.03.12.26347996 medRxiv

Top 0.1%

18.5%

Head and neck cancer (HNC) requires accurate tumor delineation for effective radiotherapy planning. Manual segmentation of tumor regions is time-consuming and subject to considerable inter-observer variability. Although several automated approaches have been proposed, many rely on multimodal imaging such as PET/CT, which is expensive, less accessible in many clinical settings, and increases the burden on patients. In this work, we investigate a CT-only three-dimensional segmentation framework that provides a clinically practical and resource-efficient alternative. CT images of 136 head and neck cancer patients from the publicly available HN1 dataset in The Cancer Imaging Archive (TCIA) were used along with 30 additional cases from a private dataset collected at a tertiary care centre, Christian Medical College (CMC), Vellore, India. A fully automated segmentation model was developed to delineate the primary gross tumor volume (GTV) using the 3D nnU-Net framework. The models were trained using the HN1 dataset and an extended HN1+CMC dataset that included the additional private cases. Performance was evaluated using three-fold cross-validation with standard segmentation metrics including Dice Similarity Coefficient (DSC), Intersection over Union (IoU), and the 95th percentile Hausdorff Distance (HD95). The proposed CT-based model achieved a Global Dice of 0.63 and a Median Dice of 0.60 on the HN1 dataset. When the additional CMC cases were incorporated during training, the performance improved to a Global Dice of 0.65 and a Median Dice of 0.71. These results demonstrate that 3D nnU-Net can effectively segment head and neck tumors from CT images alone. The proposed CT-only approach provides a cost-effective and scalable solution that can support radiotherapy treatment planning and help reduce variability in clinical workflows.
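The overlap metrics reported here can be made concrete with a short sketch on flattened binary masks. The abstract does not define "Global Dice"; the version below assumes it pools intersections and volumes across all cases before dividing, which is one common reading, whereas the median Dice is taken over per-case scores. The masks are invented.

```python
import statistics


def dice(pred, truth):
    """Dice Similarity Coefficient for two flat 0/1 masks of equal length."""
    inter = sum(p and t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))


def iou(pred, truth):
    """Intersection over Union for two flat 0/1 masks of equal length."""
    inter = sum(p and t for p, t in zip(pred, truth))
    union = sum(p or t for p, t in zip(pred, truth))
    return inter / union


def global_dice(cases):
    """Pooled Dice over (pred, truth) pairs (assumed definition)."""
    inter = sum(sum(p and t for p, t in zip(pred, truth)) for pred, truth in cases)
    total = sum(sum(pred) + sum(truth) for pred, truth in cases)
    return 2 * inter / total


# Two invented cases: a small, poorly segmented tumor and a larger, perfect one.
cases = [
    ([1, 1, 0, 0], [1, 0, 1, 0]),
    ([1, 1, 1, 0], [1, 1, 1, 0]),
]
per_case = [dice(p, t) for p, t in cases]
print(per_case, statistics.median(per_case), global_dice(cases))
```

Under this pooled definition large tumors dominate the global score, so global and median Dice can diverge in either direction, as they do between the HN1 and HN1+CMC results.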

5

Efficient and Practical Framework for Bias Estimation in Spectral CT

Sandvold, O. F.; Proksa, R.; Perkins, A. E.; Noël, P. B.

2026-03-12 radiology and imaging 10.64898/2026.03.11.26346993 medRxiv

Top 0.1%

15.2%

Background: Spectral computed tomography (CT) is increasingly used for quantitative imaging, yet accurate prediction of spectral quantitative bias remains challenging and computationally expensive with conventional approaches. Bias manifests as systematic deviations in reconstructed quantities (e.g., Hounsfield units, iodine density) from their true physical values. It arises from a combination of model mismatch, hardware/processing imperfections, exam-dependent factors, and noise-induced effects amplified by nonlinear operations such as the logarithmic transformation and material decomposition. Purpose: We present a practical, projection-based statistical framework to estimate noise-induced spectral bias efficiently, without the runtime burden of Monte Carlo (MC) simulation. Methods: To demonstrate the bias estimator, we modeled the central ray of a clinical X-ray tube attenuating through a 300 mm patient-equivalent path with a 10 mm insert containing 10 mg/mL iodine. A 120 kVp tube voltage and tube currents from 100-350 mA were used. Ideal and realistic photon-counting detector responses were simulated across 50 bin threshold settings. Quantum Poisson noise was modeled, and Bayesian probabilities of material decomposition outputs centered on ground truth iodine and water bases were computed. Expected material decomposition outputs [Formula] were derived from a 2D probability map, and bias was measured. A simple Python Monte Carlo (MC) simulation served as a reference. Results: The proposed bias estimator closely matched MC-derived bias, with an average relative iodine bias percent difference between the estimators of 0.44% across all tube currents and bin thresholds. Average runtime of the bias estimator was only 0.5% of the MC simulation. Optimal thresholds for minimizing iodine noise (via the Cramér-Rao lower bound) differed from those minimizing iodine bias, highlighting key noise-bias tradeoffs. Conclusion: Efficient spectral bias and noise estimation are essential for quantitative CT system design. This modular framework enables rapid, bias-aware optimization of spectral acquisition parameters and is adaptable to alternative spectral CT technologies beyond photon counting. Novelty and Significance of Study: Bias estimation is paramount for designing accurate spectral CT systems that deliver improved diagnostic performance. Traditional approaches rely on computationally intensive Monte Carlo simulations. We propose an efficient and practical bias estimator that uses Bayesian statistics and expected material decomposition values derived from a flexible, modular CT forward model. Unlike conventional Monte Carlo approaches, this framework enables rapid exploration of spectral design tradeoffs between bias and noise. We demonstrate both the accuracy and speed of this bias estimator relative to Monte Carlo approaches.
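The contrast between a probability-weighted expectation and Monte Carlo sampling can be shown on a one-dimensional analogue. The sketch below is not the authors' estimator, which works on a 2D iodine/water probability map. It only computes the noise-induced bias of a log-transformed Poisson count in two ways: by summing over the Poisson distribution directly, and by drawing random samples. The mean count of 20, the clamping of zero counts to one, and the sample size are all assumptions made for illustration.

```python
import math
import random


def exact_log_bias(mean_counts):
    """Bias of -ln(N) as an estimate of -ln(mean), N ~ Poisson(mean).

    Computed as ln(mean) - E[ln(max(N, 1))] by summing over the pmf.
    """
    upper = int(mean_counts + 12 * math.sqrt(mean_counts) + 50)
    expected_log = 0.0
    for n in range(1, upper + 1):  # n = 0 contributes ln(1) = 0 after clamping
        log_pmf = -mean_counts + n * math.log(mean_counts) - math.lgamma(n + 1)
        expected_log += math.exp(log_pmf) * math.log(n)
    return math.log(mean_counts) - expected_log


def poisson_sample(mean_counts, rng):
    """Knuth's multiplication method; adequate for small means only."""
    limit = math.exp(-mean_counts)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def monte_carlo_log_bias(mean_counts, samples=50_000, seed=0):
    """Same bias estimated by brute-force sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += math.log(max(poisson_sample(mean_counts, rng), 1))
    return math.log(mean_counts) - total / samples


exact = exact_log_bias(20.0)
sampled = monte_carlo_log_bias(20.0)
print(f"pmf-weighted bias {exact:.4f}, Monte Carlo bias {sampled:.4f}")
```

The direct sum needs a few hundred terms, while the sampled estimate needs tens of thousands of draws to reach comparable precision. That gap is the kind of runtime saving the abstract reports, although the actual figures there come from a far richer forward model.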

6

Quantitative Dixon-Based PDFF and R2* Estimation and Optimization on MR-Simulation and MR-Linac Devices for the Pelvis and Head and Neck: A Prospective R-IDEAL Stage 0-2a Study

McCullum, L.; West, N. A.; Shin, K.; Taylor, B. A.; Augustyn, A.; Saifi, O.; Thrower, S.; Wang, J.; Shah, S.; Choi, S.; Anakwenze, C. P.; Fuller, C. D.; Floyd, W.

2026-03-10 radiology and imaging 10.64898/2026.03.09.26347965 medRxiv

Top 0.1%

15.0%

Background and Purpose: MRI-based fat quantification can be used to automatically identify red bone marrow, which is highly sensitive to radiation and systemic therapies and could serve as an organ-of-interest for adaptive radiation therapy. Currently, the tradeoff between scan time and PDFF/R2* quantification accuracy for the 2-/3-/6-point methods, particularly for the time-constrained MR-Linac, remains unresolved. Therefore, the purpose of this study was to investigate the technical feasibility and quantitative performance of quantitative Dixon-based imaging for scanners within the radiation oncology department. Materials and Methods: A 2-/3-/6-point version of the quantitative Dixon sequence was developed and scanned on a 1.5T MR-Simulation, 3T MR-Simulation, and 1.5T MR-Linac scanner for five repetitions using the Calimetrix Model 725 PDFF-R2* phantom as a nominal reference for quantitative PDFF/R2* values. The image geometric distortion as well as the quantitative concordance, Bland-Altman agreement, repeatability, and reproducibility of the PDFF/R2* values were determined. Each sequence was evaluated in both the pelvis and head and neck across both healthy volunteers and patients. Results: The most severe geometric distortion was less than 2 mm except for the 1.5T MR-Linac when using the 2-point Dixon sequence, with distortions exceeding 5 mm. The 6-point Dixon sequence showed the highest concordance at above 0.97 across all scanners for both PDFF and R2*, followed by the 3-point and 2-point sequences. The 2-point Dixon sequence exhibited significant PDFF biases, particularly at the higher R2* values, since it did not correct for R2* during reconstruction. For the Bland-Altman analysis, the 2-point Dixon sequence had the widest 95% limits of agreement, followed by the 3-point sequence, with the 6-point Dixon sequence having the narrowest bands. The goodness-of-fit was generally lowest at higher PDFF values and lower R2* values. Both repeatability and reproducibility were the lowest for the 6-point Dixon sequence. Discussion: The 6-point quantitative Dixon sequence demonstrated superiority for the chosen evaluation metrics. The results of this work can be used to determine the threshold for true quantitative changes of PDFF/R2* while considering acquisition variabilities, enabling future biomarker studies and clinical trials. Further, this work provides validation for future investigations into quantitative bone marrow characterization.
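Two of the agreement measures in this abstract have simple closed forms. The sketch below computes Bland-Altman 95% limits of agreement and a concordance value, assuming "concordance" means Lin's concordance correlation coefficient (the abstract does not say). The phantom values are invented, not the study's measurements.

```python
import statistics


def bland_altman(measured, reference):
    """Mean difference (bias) and 95% limits of agreement."""
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = statistics.mean(diffs)
    spread = 1.96 * statistics.stdev(diffs)  # sample standard deviation
    return bias, (bias - spread, bias + spread)


def lin_ccc(measured, reference):
    """Lin's concordance correlation coefficient using population moments."""
    n = len(measured)
    mx, my = sum(measured) / n, sum(reference) / n
    var_x = sum((x - mx) ** 2 for x in measured) / n
    var_y = sum((y - my) ** 2 for y in reference) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(measured, reference)) / n
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


# Invented PDFF values (%) with a constant +1 offset in the measurement.
reference_pdff = [0.0, 10.0, 20.0, 30.0, 40.0]
measured_pdff = [1.0, 11.0, 21.0, 31.0, 41.0]
bias, limits = bland_altman(measured_pdff, reference_pdff)
ccc = lin_ccc(measured_pdff, reference_pdff)
print(bias, limits, ccc)
```

A constant offset leaves the limits of agreement narrow but still pulls the concordance below 1, which is why the two measures are reported side by side: one captures spread around the bias, the other penalizes the bias itself.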

7

The Effects of External Laser Positioning Systems for MRI Simulation on Image Quality and Quantitative MRI Values

McCullum, L.; Ding, Y.; Fuller, C. D.; Taylor, B. A.

2026-03-07 radiology and imaging 10.64898/2026.03.06.26347809 medRxiv

Top 0.1%

12.0%

Background and Purpose: Magnetic resonance imaging (MRI) for radiation therapy treatment planning is currently being used in many anatomical sites to better visualize soft tissue landmarks, a technique known as MRI simulation. A core component of modern MRI simulation configurations is the use of external laser positioning systems (ELPS) to help set up the patient. Though necessary for accurate and reproducible patient setup, the ELPS, if left on during imaging, may degrade image quality due to leaking electronic noise, to which MRI is sensitive. It is currently unknown whether this leakage of electronic noise may further affect quantitative values derived from clinically employed relaxometric, diffusion, and fat fraction sequences. Therefore, in this study, we aim to characterize the impact of MRI simulation lasers on general image quality and quantitative imaging accuracy. Materials and Methods: First, a cine acquisition was used to visualize the real-time changes in image signal-to-noise ratio (SNR) from when the ELPS was deactivated to activated. To validate this effect quantitatively, the SNR was measured using the American College of Radiology (ACR) recommended protocol in a homogeneous phantom with the integrated body, 18-channel UltraFlex small, 18-channel UltraFlex large, 32-channel spine, and 16-channel shoulder coils. Next, a geometric distortion algorithm was tested in two vendor-provided phantoms using the integrated body coil, and the ACR Large Phantom protocol was tested. Finally, a series of quantitative MRI scans were performed using a CaliberMRI Model 137 Mini Hybrid phantom to validate quantitative T1, T2, and ADC, while a Calimetrix PDFF-R2* phantom was used for quantitative PDFF and R2*. All scans were performed with the ELPS both deactivated and activated. Results: Visible electronic noise artifacts were seen on the cine acquisition when using the integrated body coil with the ELPS activated, which led to a four-fold decrease in SNR using the ACR protocol. This SNR drop was not seen when using the remaining tested coils. The automatic fiducial detection algorithm was affected negatively by ELPS activation, leading to misidentification of fiducials that were identified perfectly with the ELPS deactivated. Degradation in image intensity uniformity, percent signal ghosting, and low contrast object detectability was seen during ACR Large Phantom testing using the 20-channel Head/Neck coil. Concordance across quantitative MRI values was similar with the ELPS deactivated and activated, while a consistent increase in standard deviation inside the ADC vials was seen when the ELPS was activated. Discussion: Activating the ELPS during imaging should be avoided because it can unnecessarily increase image noise. This is particularly true when conducting mandatory quality assurance testing for image quality and geometric distortion, which uses the integrated body coil that is most susceptible to ELPS-induced noise. Clear clinical guidelines should be implemented to make this issue known to the MRI technologists, physicists, and other relevant staff using an MRI with a supplementary ELPS for patient alignment.

8

A quantitative proteomics dataset for assessment and prediction of low dose X-ray radiation exposure in mice.

Zelter, A.; Riffle, M.; Merrihew, G. E.; Mutawe, B.; Shulman, N.; Sanders, J. A.; Noble, W. S.; Johnson Erickson, D. P.; Morimoto, A.; Shaver, B. A.; Steins, T. N.; Cao, N.; Ford, E. C.; Rudnick, P. A.; Chelsky, D.; Wan, K. H.; Inman, J. L.; Chang, H.; Snijders, A. M.; Mao, J.-H.; Celniker, S. E.; De Chant, J.; Obst-Huebl, L.; Nakamura, K.; Wu, C. C.; MacCoss, M. J.

2026-05-19 biochemistry 10.64898/2026.05.18.725951 medRxiv

Top 0.1%

10.8%

Ionizing radiation induces molecular responses that may be used to estimate exposure when physical dosimeters are unavailable. Here we present two large-scale proteomics datasets generated from mouse dorsal skin punch samples collected following controlled X-ray exposures spanning multiple doses, dose rates, and post-exposure time points. Experiment 1 comprised 96 samples (including 16 reference samples) collected 6 days after exposure to 0-75 cGy delivered at either 30 or 300 cGy/min. Experiment 2 comprised 936 samples (including 236 reference samples) exposed to 0-100 cGy at either 3 or 28 cGy/min dose rates and harvested between 7 and 150 days post-exposure. All samples were processed using a standardized workflow involving automated bead-based digestion and data-independent acquisition mass spectrometry. The datasets include multiple pooled reference sample types, process controls, and system suitability standards ensuring high quality data. All data presented are available via ProteomeXchange at several levels of processing, from raw files through normalized peptide- and protein-level abundance matrices suitable for biomarker discovery and machine learning applications. This dataset will facilitate generation of new insights into the biological changes and molecular signatures resulting from X-ray exposure in mice and may also help inform future studies in humans.

9

Analysis Of Augmentation Techniques for Spine X-Ray Images

Sivakumar, E.; Anand, A.

2026-04-17 radiology and imaging 10.64898/2026.04.15.26350121 medRxiv

Top 0.1%

10.7%

Computer vision and deep learning techniques, including convolutional neural networks (CNNs) and transformers, have increased the performance of medical image classification systems. However, training deep learning models using medical images is a challenging task that necessitates a substantial amount of annotated data. In this paper, we implement data augmentation strategies to tackle dataset imbalance in the VinDr-SpineXR dataset, which has a lower number of spine abnormality X-ray images compared to normal spine X-ray images. Geometric transformations and synthetic image generation using Generative Adversarial Networks are explored and applied to the abnormal classes of the dataset, and classifier performance is validated using VGG-16 and InceptionNet to identify the most effective augmentation technique. Additionally, we introduce a hybrid augmentation technique that addresses class imbalance, reduces computational overhead relative to a GAN-only approach, and achieves ~99% validation accuracy with both classifiers across all three case studies.

10

Scan length as a major driver of CT radiation dose: a diagnostic reference level audit from Kosovo

Rudi, G.; Vula, F.; Bicaku, A.; Dedushi, K.; Ahmetgjekaj, I.

2026-05-17 radiology and imaging 10.64898/2026.05.12.26353024 medRxiv

Top 0.1%

10.6%

Computed tomography is the largest contributor to population radiation dose from medical imaging, yet no diagnostic reference levels (DRLs) have been published from Kosovo or the Western Balkans. This retrospective audit analyzed all CT examinations performed on a 128-slice scanner at the University Clinical Centre of Kosovo between January and March 2026. After exclusions, 1,535 acquisitions from 1,092 patients across nine examination categories were analyzed. Local DRLs were defined as the 75th percentile and compared against German (BfS 2022) and Turkish (Kahraman et al., 2024) reference values. Head CT (n = 590) demonstrated CTDIvol 4.7% below the BfS DRL yet scan length 98.5% above the orientation value (median 25.8 vs 13 cm). Abdomen-pelvis CTDIvol matched the BfS reference while scan length exceeded it by 28%. Coronary CTA showed CTDIvol +377%, consistent with retrospective ECG gating. Excess scan length, not CTDIvol, is the major driver of elevated dose at this institution. The identified excesses are correctable through technologist landmarking training, protocol review, and enabling iterative reconstruction.
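The two calculations behind this audit, a 75th-percentile DRL and a signed percent deviation from a reference, are short enough to sketch. The percentile interpolation method below is an assumption (the abstract does not specify one), and the list of scan lengths is invented. Only the 25.8 cm and 13 cm figures come from the abstract.

```python
import statistics


def local_drl(values):
    """75th percentile of a dose metric (inclusive quartile interpolation)."""
    return statistics.quantiles(values, n=4, method="inclusive")[2]


def percent_difference(local, reference):
    """Signed deviation of a local value from a reference, in percent."""
    return 100.0 * (local - reference) / reference


# Invented scan lengths in cm, for illustration only.
scan_lengths_cm = [22.0, 24.5, 25.8, 26.1, 27.3]
print(local_drl(scan_lengths_cm))

# Head CT scan length from the abstract: median 25.8 cm vs 13 cm orientation value.
head_excess = round(percent_difference(25.8, 13.0), 1)
print(head_excess)
```

The second calculation reproduces the 98.5% head CT scan-length excess quoted in the abstract, since (25.8 - 13) / 13 is about 0.985.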

11

Technical Development and Implementation of 3D-QALAS on a 1.5T MR-Linac for the Brain: A Prospective R-IDEAL Stage 0/1 Technology Development Report

McCullum, L.; Harrington, A.; Taylor, B. A.; Hwang, K.-P.; Fuller, C. D.

2026-03-10 radiology and imaging 10.64898/2026.03.09.26347967 medRxiv

Top 0.1%

10.5%

Background and Purpose: Quantitative relaxometry on the integrated MRI / linear accelerator (MR-Linac) at high isotropic resolution is currently limited due to prohibitively long scan times and limited fields of view. Therefore, the purpose of this study was to assess the technical feasibility of the 3D-QALAS technique on the 1.5T MR-Linac, which has the ability to acquire whole-brain 1 mm isotropic quantitative T1, T2, and PD maps along with multiple synthetic images in a 7 minute acquisition time. Materials and Methods: A 1 mm isotropic 3D-QALAS acquisition was scanned in both phantoms and a healthy volunteer on the 1.5T Elekta Unity MR-Linac device with scan times around seven minutes. A test-retest protocol across five independent sessions for the phantom was conducted. The correlation, repeatability, and reproducibility between measured and reference quantitative T1, T2, and PD values were determined in the phantom. Distortion was also studied. Vendor-provided reconstruction through SyMRI was performed to extract synthetic images and brain volume metric assessments on a healthy volunteer. Results: The slope and concordance between the measured and phantom reference values were 1.02 (1.00), 1.09 (0.90), and 0.99 (1.00) for T1, T2, and PD, respectively. Median distortion across the phantom remained below 2 mm. The repeatability and reproducibility coefficient-of-variation (CoV) was under 8% for all measured values. The measured brain volumes in the healthy volunteer were within expected age-adjusted reference values. Discussion: The technical feasibility of using 3D-QALAS on the integrated 1.5T MR-Linac was confirmed. Applying this technique to the head and neck adaptive radiation therapy workflow will provide new opportunities to integrate quantitative imaging relaxometry biomarkers at 1 mm isotropic resolution.

12

Three-dimensional printing of lifelike PET phantoms

Ge, Y.; Li, E. J.; McDonald, S.; Geagan, M.; Parma, M. J.; Gao, M.; Mei, K.; Pasyar, P.; Im, J. Y.; Muller, F. M.; Pantel, A. R.; Karp, J. S.; Noel, P. B.

2026-05-14 radiology and imaging 10.64898/2026.05.11.26352857 medRxiv

Top 0.1%

10.5%

Background: Realistic PET/CT phantoms are essential for system evaluation, protocol optimization, and validation of advanced reconstruction methods. However, existing phantoms are often limited by simplified geometries, spatially uniform activity patterns, and complex preparation procedures. Purpose: To develop and evaluate PixelPrintPET, a 3D printing-based method for fabricating anatomically realistic PET/CT phantoms with spatially heterogeneous radiotracer distributions and a single-solution filling workflow that avoids physical compartmentalization. Methods: PixelPrintPET generates voxel-based printing instructions that encode spatially varying infill, which is realized during printing through modulation of filament extrusion, enabling heterogeneous activity distributions without compartmentalization of radioactivity at different activity concentrations. Calibration phantoms and anatomically structured phantoms were designed and printed using high-flow polylactic acid (PLA), with anatomical inputs derived from either digital atlas-based models or patient imaging data. The printed phantoms were subsequently filled by immersion in a radioactive solution, allowing activity distribution to be controlled by the internal porous structure. A bottom-up filling procedure with reduced surface tension was developed to ensure uniform infiltration and minimize air entrapment. Phantoms were imaged on the PennPET Explorer PET/CT system, and quantitative performance was evaluated using contrast recovery coefficient (CRC), target-to-background ratio (TBR), and comparisons with simulated or patient-derived reference data. Results: A strong linear relationship between infill ratio and normalized signal (R² = 0.998) was demonstrated by the calibration phantom, enabling reliable mapping between structure and activity. Additionally, air entrapment was minimized to less than 1% of the total phantom volume. In the contrast recovery phantom, CRC values were consistent with measurements using traditional phantoms. The brain phantom reproduced atlas-derived contrast patterns, with gray-to-white matter differences within 5% after accounting for resolution and other system effects. The patient-based thorax phantom showed high reproducibility across repeated scans, with differences within 3%, and closely matched the input patient image with regional differences within 10% in all regions except the lung. Conclusions: PixelPrintPET enables the fabrication of realistic, reproducible, and versatile PET/CT phantoms with voxel-level control of the activity distribution. This approach provides a practical solution for generating patient-specific and application-specific phantoms, with the potential to accelerate system validation, protocol development, and clinical translation of advanced PET/CT technologies.

13

Generalizable Deep Learning Framework for Radiotherapy Dose Prediction Across Cancer Sites, Prescriptions and Treatment Modalities

Chang, H.-h.; Cardan, R.; Nedunoori, R.; Fiveash, J.; Popple, R.; Bodduluri, S.; Stanley, D. N.; Harms, J.; Cardenas, C.

2026-04-22 radiology and imaging 10.64898/2026.04.17.26350770 medRxiv

Top 0.1%

10.3%

Optimizing radiotherapy dose distributions remains a resource-intensive bottleneck. Existing AI-based dose prediction methods often have limited generalizability because they rely on small, heterogeneous datasets. We present nnDoseNetv2, an auto-configured, end-to-end framework for dose prediction across diverse disease sites (head and neck, prostate, breast, and lung), prescription levels (1.5-84 Gy), and treatment modalities (IMRT, VMAT, and 3D-CRT). By integrating machine-specific beam geometry with 3D structural information, the framework is designed to generalize across varied clinical scenarios. A single multi-site model was trained on 1,000 clinical plans. On sites seen during training, performance was comparable to specialized site-specific models. On unseen sites (liver and whole brain), the model outperformed site-specific models, with mean absolute errors of 2.46% and 6.97% of prescription, respectively. These results suggest that geometric awareness can bridge disparate anatomical domains while eliminating the need for site-specific model maintenance, providing a scalable and high-fidelity approach for personalized radiotherapy planning.

14

Retrieval-Augmented Claude Opus 4.7 and GPT-5.5 Surpass Human Performance on the Nuclear Cardiology Board Preparation Exam (and Claude Drafts a Paper About it)

Killekar, A.; Shanbhag, A.; Miller, R. J.; Dey, D.; Bourque, J.; Phillips, L.; Chareonthaitawee, P.; Slomka, P.

2026-05-13 radiology and imaging 10.64898/2026.05.08.26352768 medRxiv

Top 0.1%

9.5%

Background: Previous studies evaluated large language model (LLM) performance on the American Society of Nuclear Cardiology (ASNC) Board Preparation Exam. Without domain-specific context, the best model (GPT-4o) achieved 63.1%, below the estimated 65% passing threshold and the 78% mean score of human fellows-in-training (FITs). Providing textbook context improved GPT-4o to 73.8% on text-only questions, but it still fell short of human trainees. Whether next-generation LLMs with retrieval-augmented generation (RAG) can close this gap is unknown. Methods: Claude Opus 4.7 and GPT-5.5 were administered all 168 questions (141 text-only, 27 image-based) from the 2023 ASNC Board Preparation Exam across 5 iterations each, using RAG with a nuclear cardiology textbook, companion atlas, and ASNC clinical guidelines. Claude used local FAISS-based semantic retrieval; GPT-5.5 used Azure's cloud-hosted vector store. Performance was compared to prior LLM results and 13 human FITs. Results: Across 5 iterations, Claude Opus 4.7 achieved a mean accuracy of 86.3% ± 1.4% (text 88.8%, image 73.3%). GPT-5.5 achieved 86.7% ± 2.2% (text 88.5%, image 77.0%) but refused a mean of 12.2 questions (7.3%) per iteration due to safety filters. Both models surpassed the human FIT mean (78.0%) and the estimated passing threshold. Compared to GPT-4o without context (63.1%), this represents a 23-percentage-point improvement in 18 months. Conclusion: Next-generation LLMs with RAG now surpass average human trainee performance on nuclear cardiology board preparation questions, suggesting significant potential as educational tools and knowledge-reference aids in cardiovascular imaging.
Condensed Abstract: Across 5 iterations each, Claude Opus 4.7 and GPT-5.5 with retrieval-augmented generation achieved mean accuracies of 86.3% and 86.7% on the 2023 ASNC Board Preparation Exam (168 questions), both surpassing the mean human fellow-in-training score of 78%. GPT-5.5 refused a mean of 12.2 questions (7.3%) per iteration due to safety filters. These results represent a 23-percentage-point improvement over the best prior LLM without context (63.1%), demonstrating that RAG-enhanced LLMs have reached human-level proficiency in nuclear cardiology knowledge.
Graphical Abstract: Overview of the three-study research arc evaluating LLM performance on the 2023 ASNC Board Preparation Exam. Study 1 (2024) tested four LLMs without context (best: GPT-4o, 63.1%). Study 2 (2025) added textbook context to GPT-4o (73.8%). Study 3 (2026, current) evaluated Claude Opus 4.7 and GPT-5.5 with retrieval-augmented generation across 5 iterations each (mean 86.3% and 86.7%, respectively), both surpassing the human fellow-in-training mean of 78%. Right panel shows the performance scale with key thresholds.
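
The headline numbers in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted above; the variable and function names are illustrative and do not come from the preprint.

```python
# Figures quoted in the abstract (accuracy in percent).
GPT4O_NO_CONTEXT = 63.1
HUMAN_FIT_MEAN = 78.0
RAG_RESULTS = {"Claude Opus 4.7": 86.3, "GPT-5.5": 86.7}
TOTAL_QUESTIONS = 168
GPT55_MEAN_REFUSALS = 12.2


def point_gap(score, baseline):
    """Difference in percentage points, rounded to one decimal place."""
    return round(score - baseline, 1)


# Gain over the best prior model without context, and margin over human trainees.
gains = {name: point_gap(s, GPT4O_NO_CONTEXT) for name, s in RAG_RESULTS.items()}
margins = {name: point_gap(s, HUMAN_FIT_MEAN) for name, s in RAG_RESULTS.items()}

# Share of the 168 questions that GPT-5.5 refused per iteration, in percent.
refusal_rate = round(100 * GPT55_MEAN_REFUSALS / TOTAL_QUESTIONS, 1)

print(gains, margins, refusal_rate)
```

Both gains round to the 23 percentage points the abstract reports, and the refusal share matches the stated 7.3%.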

15

Quantitative imaging of the central lymphatic system with spectral CT iodine mapping: a feasibility study in swine

Liu, L. P.; Gurevich, A.; McClung, G.; Itkin, M.; Noël, P. B.

2026-05-07 radiology and imaging 10.64898/2026.05.06.26352364 medRxiv

Top 0.1%

8.6%

Purpose: Imaging of the central lymphatic system enables characterization of patient-specific lymphatic anatomy and accurate localization of leaks. Advancements in CT technology, particularly spectral CT, can enhance CT lymphangiography (CTL) with improved visualization and quantification. This study aimed to assess the feasibility of spectral CTL in both static and dynamic scans. Materials and Methods: 50% diluted iodinated contrast was injected into the bilateral superficial inguinal lymph nodes of a pig. The pig was scanned with a dual-layer spectral CT every 60 seconds for 10 minutes. To optimize contrast and visualize peristalsis, a second animal was injected with 25% and 10% diluted contrast and scanned dynamically 4 and 6.25 minutes after contrast injection. Conventional images and iodine maps were reconstructed to calculate the contrast-to-noise ratio (CNR). Additionally, the iodine density was measured adjacent to the lymphovenous junction to show fluctuations from peristalsis and contrast washout. Results: Iodine maps, compared to conventional images, separated the contrast-filled central lymphatic system from surrounding soft tissue and increased CNR to 895 compared to 43 with conventional images. 25% diluted contrast provided the best balance between visualization and quantification of the central lymphatic system, showing high and low iodine density regions corresponding to peristalsis. Iodine density peaked at 15.4 ± 0.6 mg/mL and decreased to 2.0 ± 0.1 mg/mL at 10.5 minutes. Conclusion: Spectral CTL not only improves visualization of the central lymphatic system compared to CTL but also provides quantitative information for physiological characterization of lymphatic disease that can enhance current subjective assessment.
Research highlights:
- Iodine maps from spectral CT lymphangiography separated contrast-filled lymphatic structures from surrounding soft tissue and provided better contrast-to-noise compared to conventional images.
- Spectral CT lymphangiography enabled quantification of contrast in the central lymphatic system that demonstrated contrast washout and may be utilized for physiological characterization of disease.
- Dynamic spectral CT imaging of the lymphatic system visually showed peristalsis in the thoracic duct and was further reflected in quantitative iodine density measurements.
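
The abstract reports CNR values but not the formula used. A minimal sketch, assuming the common definition (absolute difference between the mean of a region of interest and the mean of a background region, divided by the background standard deviation), might look like this. The function name and sample values are hypothetical.

```python
from statistics import mean, stdev


def contrast_to_noise(roi_values, background_values):
    """CNR under an assumed definition: |mean(ROI) - mean(background)| / SD(background)."""
    noise = stdev(background_values)  # sample standard deviation of the background
    if noise == 0:
        raise ValueError("background has zero variance; CNR is undefined")
    return abs(mean(roi_values) - mean(background_values)) / noise


# Made-up iodine-density samples (mg/mL), for illustration only.
lymphatic_roi = [10.0, 10.0, 10.0]
soft_tissue = [1.0, 2.0, 3.0]
print(contrast_to_noise(lymphatic_roi, soft_tissue))
```

Under this definition an iodine map can score a much higher CNR than a conventional image simply because unenhanced soft tissue sits near zero iodine density with little spread.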

16

Automated Anatomy-Based Subsegmentation of Pelvic and Proximal Femoral CT: Validation Across Clinically Relevant Regions and Landmarks

Rashed, M.; Alabdulrahman, H.

2026-05-19 radiology and imaging 10.64898/2026.05.14.26353237 medRxiv

Top 0.1%

8.5%

Background: Automated pelvic CT segmentation has advanced to reliable coarse bone extraction. Yet the structured anatomical hierarchy required for morphometry, fixation planning, bone quality mapping, and arthroplasty workflows remains unachieved. This study developed and validated a fully automated anatomy-informed pipeline that converts standard pelvic CT into a comprehensive, surgeon-readable subsegmentation of the pelvis and proximal femur. Methods: Pelvic CT datasets were retrospectively collected from anonymized archives of hospitals affiliated with the Directorate of Health Affairs, Sharqia, Egypt. After eligibility screening, 757 normal adult cases were processed using a custom one-click 3D Slicer pipeline integrating TotalSegmentator for coarse extraction, followed by deterministic anatomy-based subsegmentation into 81 segments. One hundred randomly selected cases were validated against expert-corrected reference segmentations using Dice similarity coefficient, volume difference, surface distance metrics, and bilateral symmetry analysis. Results: Of 1,316 screened cases, 757 met eligibility criteria. Across 8,100 case-segment observations, the pipeline achieved a mean Dice of 0.9926 ± 0.0465. Complete agreement was observed for the sacrum, ilium, acetabulum, anterior and posterior columns, sciatic buttress, and all landmarks. Relative decreases were confined to boundary-dependent regions. Bilateral symmetry analysis confirmed a median surface agreement of 99.85% within 5 mm. Conclusion: The pipeline demonstrated high accuracy and reproducibility across a large normal adult dataset, establishing a structured anatomical foundation for quantitative pelvic analysis and surgical planning workflows. Clinical feasibility across abnormal anatomy and decision-level applications awaits dedicated validation.
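
For readers unfamiliar with the validation metric, the Dice similarity coefficient for one segment is twice the overlap between the predicted and reference voxel sets, divided by the sum of their sizes. The sketch below is a generic illustration and not the pipeline's own code; representing voxels as index tuples is an assumption made for brevity.

```python
def dice(predicted_voxels, reference_voxels):
    """Dice similarity coefficient: 2 * |A & B| / (|A| + |B|)."""
    a, b = set(predicted_voxels), set(reference_voxels)
    if not a and not b:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Toy masks given as (x, y, z) voxel indices, for illustration only.
predicted = {(0, 0, 0), (0, 0, 1)}
reference = {(0, 0, 1), (0, 0, 2)}
print(dice(predicted, reference))
```

The 8,100 observations in the abstract correspond to one such score per segment per case (100 validation cases times 81 segments).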

17

A Cardiac Contouring Atlas of the Left Ventricle Myocardial Walls on CT

Wei, J.; Abdollahi, A.; Knoll, M.; Furkel, J.

2026-05-07 cardiovascular medicine 10.64898/2026.05.06.26352374 medRxiv

Top 0.1%

7.5%

Background and purpose: Precise manual annotation of the left ventricular myocardial (LVM) wall is essential for cardiac substructure research, wall-specific radiation dosimetry, and segmentation model development. However, existing radiotherapy-oriented atlases and conventional CT viewing planes lack an explicit framework for reproducible, wall-level LVM delineation. To address this gap, we developed an anatomy-guided manual segmentation protocol for delineating the five LVM walls on non-contrast-enhanced CT (NECT) or contrast-enhanced CT (CECT) scans. Materials and methods: This protocol was developed using 60 chest CT scans from two prospective cohorts at Heidelberg University Hospital, including 50 CECTs from the IMRT-MC2 cohort and 10 NECTs from the MAGELLAN cohort. Manual contouring was performed in 3D Slicer. Segmentation rules were established through review by a radiation oncologist and a cardiology expert, based on the American Heart Association 17-segment model, and were tested on additional CT scans before final protocol definition. Results: The protocol centers on three geometric steps: (1) defining the LV long axis using the endocardial apex and the center of the mitral annulus; (2) constructing an apical delimitation plane based on LV geometry; and (3) partitioning wall regions via intersections of the right ventricular and LV cavity centers in the short-axis view. This workflow enables structured segmentation of the anterior, septal, lateral, inferior, and apical LVM walls, supporting anatomically coherent 3D reconstruction. Conclusion: This study provides contouring steps and a representative atlas as a methodological basis for standardized annotation, with potential applications in dose-mapping cardiotoxicity analysis and deep-learning modeling for radiotherapy.
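
Step (1) of the protocol, defining the LV long axis from two landmarks, reduces to simple vector geometry. The sketch below is a hypothetical illustration only: the coordinates are made up, the function names are not from the preprint, and the abstract does not say where the apical delimitation plane of step (2) is placed, so the fractional position along the axis is shown merely as one quantity such a plane could be based on.

```python
import math


def unit_long_axis(apex, mitral_center):
    """Unit vector pointing from the endocardial apex to the mitral annulus center."""
    direction = [m - a for a, m in zip(apex, mitral_center)]
    length = math.sqrt(sum(c * c for c in direction))
    if length == 0:
        raise ValueError("apex and mitral annulus center coincide")
    return [c / length for c in direction]


def position_along_axis(point, apex, mitral_center):
    """Fractional position of a point projected onto the axis (0 = apex, 1 = annulus)."""
    direction = [m - a for a, m in zip(apex, mitral_center)]
    length_sq = sum(c * c for c in direction)
    if length_sq == 0:
        raise ValueError("apex and mitral annulus center coincide")
    offset = [p - a for p, a in zip(point, apex)]
    return sum(o * d for o, d in zip(offset, direction)) / length_sq


# Made-up landmark coordinates in millimetres.
apex = (0.0, 0.0, 0.0)
mitral_center = (0.0, 0.0, 80.0)
print(unit_long_axis(apex, mitral_center))
print(position_along_axis((5.0, 5.0, 20.0), apex, mitral_center))
```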

18

Feature-Based Parametric Response Mapping on Thoracic Computed Tomography for Robust Disease Classification in COPD

Namvar, A.; Shan, B.; Hoff, B.; Labaki, W. W.; Murray, S.; Bell, A. J.; Galban, S.; Kazerooni, E. A.; Martinez, F. J.; Hatt, C. R.; Han, M. K.; Galban, C. J.; Ram, S.

2026-04-27 radiology and imaging 10.64898/2026.04.24.26351675 medRxiv

Top 0.1%

7.5%

Purpose: To develop an interpretable feature-based Deep Parametric Response Mapping (PRMD) method that combines wavelet scattering convolution networks and machine learning to spatially detect and quantify functional small airways disease (fSAD) and emphysema on paired inspiratory-expiratory CT scans, with enhanced noise robustness. Materials and Methods: In this retrospective analysis of prospectively acquired data (2007-2017), we developed and validated a deep learning-based PRM approach using paired CT scans from 8,972 tobacco-exposed COPDGene participants (≥10 pack-years; mean age 60.1 ± 8.8 years; 46.5% women), including controls with normal spirometry (n = 3,872), PRISm (n = 1,089), and GOLD 1-4 COPD (n = 4,011). Data were stratified into training, validation, and testing sets (24:6:70). PRMD extracts translation-invariant image features using a wavelet scattering network and applies a subspace learning classifier to classify voxels as emphysema or non-emphysematous air trapping (fSAD). PRMD was compared with conventional density-based PRM for voxel-wise agreement, correlation with pulmonary function, robustness to noise, and sensitivity to misregistration using Pearson correlation, Bland-Altman analysis, and paired t tests. Results: PRMD achieved 95% voxel-wise agreement with standard PRM (r = 0.98) while demonstrating significantly greater robustness under noise. PRMD showed stronger correlations with FEV1 (emphysema: r = -0.54; fSAD: r = -0.51; P < 0.0001) than standard PRM (r = -0.42 for both; P < 0.0001). Under simulated high-noise conditions, standard PRM overestimated disease by ~15%, whereas PRMD limited error to < 5% (P < 0.001). Conclusion: PRMD provides an interpretable, feature-driven and noise-resilient alternative to traditional PRM for emphysema and fSAD classification, enhancing the reliability of CT-based COPD phenotyping for multi-center studies and low-dose imaging applications.
Key Points:
- This study introduces combined wavelet scattering and subspace learning for medical image segmentation, enabling accurate, interpretable voxel-level classification of emphysema and functional small airways disease on paired CT scans.
- The proposed Deep Parametric Response Mapping method demonstrated 95% voxel-wise agreement with standard Parametric Response Mapping and stronger correlations with spirometric measures, enhancing the clinical relevance of CT-based phenotyping for Chronic Obstructive Pulmonary Disease.
- Deep Parametric Response Mapping significantly improved robustness to image noise, reducing overestimation of emphysema and functional small airways disease from ~15% to <5% (P < 0.001), and benefits from reduced data requirements due to the fixed, mathematically defined filters used in wavelet scattering.
Summary Statement: Deep Parametric Response Mapping improves the accuracy and noise robustness of CT-based classification of emphysema and functional small airways disease using feature-based representations, enhancing the reliability of COPD phenotyping.
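
For context, the conventional density-based PRM used as the comparator can be sketched as a two-threshold rule on registered inspiratory and expiratory attenuation values. This is not the PRMD method. The thresholds below (-950 HU on inspiration, -856 HU on expiration) are the values commonly cited for PRM and are an assumption here, since the abstract does not state them; the function name, the label strings, and the handling of values exactly at a threshold are illustrative.

```python
# Assumed thresholds in Hounsfield units (commonly cited for PRM, not from the abstract).
INSPIRATION_THRESHOLD_HU = -950
EXPIRATION_THRESHOLD_HU = -856


def classify_voxel(inspiration_hu, expiration_hu):
    """Label one registered voxel pair with a conventional PRM class."""
    low_on_inspiration = inspiration_hu < INSPIRATION_THRESHOLD_HU
    low_on_expiration = expiration_hu < EXPIRATION_THRESHOLD_HU
    if not low_on_inspiration and not low_on_expiration:
        return "normal"
    if not low_on_inspiration and low_on_expiration:
        return "fSAD"  # air trapping without emphysema-level density on inspiration
    if low_on_inspiration and low_on_expiration:
        return "emphysema"
    return "unclassified"  # low on inspiration only


print(classify_voxel(-900, -900))
```

Because each label depends on a hard cutoff, noise that shifts a voxel by a few HU can flip its class, which is the sensitivity the abstract reports PRMD reducing.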

19

Improving Glioblastoma Classification Using Quantitative Transport Mapping with a Synthetic Data Trained Deep Neural Network

Romano, D. J.; Roberts, A. G.; Weppner, B.; Zhang, Q.; John, M.; Hu, R.; Sisman, M.; Kovanlikaya, I.; Chiang, G. C.; Spincemaille, P.; Wang, Y.

2026-04-01 radiology and imaging 10.64898/2026.03.31.26349864 medRxiv

Top 0.1%

6.7%

Purpose: To develop a deep neural network-based, AIF-free perfusion estimation method (QTMnet) for improved performance on glioma classification. Methods: A globally defined arterial input function (AIF) is needed to recover perfusion parameters in the two-compartment exchange model (2CXM). We have developed Quantitative Transport Mapping (QTM) to create an AIF-independent estimation method. QTM estimation can be formulated using deep neural networks trained on synthetic DCE-MRI data (QTMnet). Here, we provide a fluid mechanics-based DCE-MRI simulation with exchange between the capillaries and extravascular extracellular space. We implemented tumor ROI generation to morphologically characterize tissue perfusion. We compared our QTMnet implementation with 2CXM on 30 human subjects with glioma, 15 with low-grade gliomas and 15 with high-grade glioblastomas. Results: QTMnet outperforms (best AUC: 0.973) traditional 2CXM (best AUC: 0.911) in a glioma grading task. Conclusion: The AIF-independent QTMnet estimation provides a quantitative delineation between low-grade and high-grade gliomas.
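
To see why the 2CXM depends on an AIF, it helps to write the model out. The equations below give the standard textbook form of the two-compartment exchange model; the notation is an assumption, since the abstract defines no symbols. Here $C_a$ is the arterial input function, $C_p$ and $C_e$ are the tracer concentrations in plasma and in the extravascular extracellular space, $v_p$ and $v_e$ are the corresponding volume fractions, $F_p$ is plasma flow, and $PS$ is the permeability-surface area product.

```latex
\begin{align}
v_p \frac{dC_p(t)}{dt} &= F_p \bigl( C_a(t) - C_p(t) \bigr) - PS \bigl( C_p(t) - C_e(t) \bigr) \\
v_e \frac{dC_e(t)}{dt} &= PS \bigl( C_p(t) - C_e(t) \bigr) \\
C_t(t) &= v_p C_p(t) + v_e C_e(t)
\end{align}
```

The measured tissue concentration $C_t$ is driven entirely by $C_a(t)$, so fitting $F_p$, $PS$, $v_p$, and $v_e$ requires the AIF to be specified. That is the dependence QTM is described as removing.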

20

Assessment of patient radiation dose in conventional lumbar spine radiography: A multicenter study in the Souss Massa region, Morocco

Soudi, A.; Menhour, Y.

2026-03-26 radiology and imaging 10.64898/2026.03.24.26349174 medRxiv

Top 0.1%

6.6%

Background: Patient radiation exposure in diagnostic radiology is an important concern for radiation protection and patient safety. Monitoring radiation dose levels during radiographic examinations is essential to ensure compliance with diagnostic reference levels (DRLs) and to optimize radiological practices. Objective: The aim of this study was to evaluate patient radiation dose during conventional lumbar spine radiography and compare the obtained values with diagnostic reference levels. Methods: A descriptive cross-sectional multicenter study was conducted in four hospitals in the Souss Massa region, Morocco, between April and June 2017. Data were collected from 142 patients undergoing lumbar spine radiography examinations and from 20 radiology technicians. Exposure parameters including tube voltage, tube current, exposure time, focus-to-film distance, and field size were recorded. Entrance surface dose (ESD) was estimated using MICADO software, and dose area product (DAP) values were subsequently calculated. The 75th percentile values were determined and compared with diagnostic reference levels. Results: The regional 75th percentile ESD values were 5.33 mGy for the anteroposterior projection and 7.38 mGy for the lateral projection. Corresponding DAP values were 1840.9 mGy·cm² and 2783.65 mGy·cm², respectively. All obtained values were below the diagnostic reference levels used for comparison. However, variations between hospitals were observed, likely due to differences in imaging protocols and equipment. Conclusion: Radiation doses associated with lumbar spine radiography in the evaluated hospitals were within acceptable limits according to diagnostic reference levels. Continuous monitoring of patient radiation exposure and optimization of radiographic techniques remain essential to ensure effective radiation protection.
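
The comparison described in the Methods, taking the 75th percentile of a dose distribution and checking it against a reference level, can be sketched as follows. The dose values and the reference level below are made up for illustration and are not data from the study; the interpolation method is also an assumption, since the abstract does not say how the percentile was computed.

```python
from statistics import quantiles


def third_quartile(doses):
    """75th percentile using linear interpolation over the observed range."""
    return quantiles(doses, n=4, method="inclusive")[2]


def within_reference_level(doses, reference_level):
    """True when the 75th percentile does not exceed the given reference level."""
    return third_quartile(doses) <= reference_level


# Hypothetical entrance surface doses (mGy) and a hypothetical reference level.
example_esd_mgy = [1.0, 2.0, 3.0, 4.0, 5.0]
hypothetical_drl_mgy = 5.0
print(third_quartile(example_esd_mgy))
print(within_reference_level(example_esd_mgy, hypothetical_drl_mgy))
```

Using the 75th percentile rather than the mean follows the convention that reference levels are set from the upper quartile of observed practice, so a facility near the median can sit well below the threshold.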